Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
null (Ed.)We introduce the task ofstory fragment stitching,which is the process of automatically aligning andmerging event sequences of partial tellings of astory (i.e.,story fragments). We assume that eachfragment contains at least one event from the storyof interest, and that every fragment shares at leastone event with another fragment. We propose agraph-based unsupervised approach to solving thisproblem in which events mentions are representedas nodes in the graph, and the graph is compressedusing a variant of model merging to combine nodes.The goal is for each node in the final graph to con-tain only coreferent event mentions. To find coref-erent events, we use BERT contextualized embed-ding in conjunction with atf-idfvector representa-tion. Constraints on the merge compression pre-serve the overall timeline of the story, and the finalgraph represents the full story timeline. We evalu-ate our approach using a new annotated corpus ofthe partial tellings of the story of Moses found inthe Quran, which we release for public use. Ourapproach achieves a performance of 0.63F1scoremore » « less
-
null (Ed.)Determining whether an event in a news article is a foreground or background event would be useful in many natural language processing tasks, for example, temporal relation extraction, summarization, or storyline generation. We introduce the task of distinguishing between foreground and background events in news articles as well as identifying the general temporal position of background events relative to the foreground period (past, present, future, and their combinations). We achieve good performance (0.73 F1 for background vs. foreground and temporal position, and 0.79 F1 for background vs. foreground only) on a dataset of news articles by leveraging discourse information in a featurized model. We release our implementation and annotated data for other researchersmore » « less
-
Recognizing the internal structure of events is a challenging language processing task of great importance for text understanding. We present a supervised model for automatically identifying when one event is a subevent of another. Building on prior work, we introduce several novel features, in particular discourse and narrative features, that significantly improve upon prior state-of-the-art performance. Error analysis further demonstrates the utility of these features. We evaluate our model on the only two annotated corpora with event hierarchies: HiEve and the Intelligence Community corpus. No prior system has been evaluated on both corpora. Our model outperforms previous systems on both corpora, achieving 0.74 BLANC F1 on the Intelligence Community corpus and 0.70 F1 on the HiEve corpus, respectively a 15 and 5 percentage point improvement over previous models.more » « less
-
null (Ed.)Identifying the discourse structure of documents is an important task in understanding written text. Building on prior work, we demonstrate an improved approach to automatically identifying the discourse function of paragraphs in news articles. We start with the hierarchical theory of news discourse developed by van Dijk (1988) which proposes how paragraphs function within news articles. This discourse information is a level intermediate between phrase- or sentence-sized discourse segments and document genre, characterizing how individual paragraphs convey information about the events in the storyline of the article. Specifically, the theory categorizes the relationships between narrated events and (1) the overall storyline (such as Main Events, Background, or Consequences) as well as (2) commentary (such as Verbal Reactions and Evaluations). We trained and tested a linear chain conditional random field (CRF) with new features to model van Dijk’s labels and compared it against several machine learning models presented in previous work. Our model significantly outperformed all baselines and prior approaches, achieving an average of 0.71 F1 score which represents a 31.5% improvement over the previously best-performing support vector machine model.more » « less
An official website of the United States government

Full Text Available